ABSTRACT

Plant-specific SGS3-like proteins are composed of
various combinations of an RNA-binding XS domain,
a zinc-finger zf-XS domain, a coiled-coil domain and a
domain of unknown function called XH. In addition
to INVOLVED IN DE NOVO 2 (IDN2) and SGS3, the
Arabidopsis genome encodes 12 uncharacterized
SGS3-like proteins. Here, we show that a group of
SGS3-like proteins acts redundantly in the RNA-directed
DNA methylation (RdDM) pathway in Arabidopsis.
Transcriptome co-expression analyses reveal significantly
correlated expression of two SGS3-like
proteins, factor of DNA methylation 1 (FDM1) and
FDM2, with known genes required for RdDM. The
fdm1 fdm2 double mutations, but not the fdm1
or fdm2 single mutations, significantly impair DNA
methylation at RdDM loci, release transcriptional
gene silencing and dramatically reduce the abundance
of siRNAs originating from high-copy-number
repeats or transposons. Like IDN2 and
SGS3, FDM1 binds dsRNAs with 5' overhangs.
Double mutant analyses also reveal that IDN2 and
three uncharacterized SGS3-like proteins, FDM3,
FDM4 and FDM5, have overlapping functions with
FDM1 in RdDM. Five FDM proteins and IDN2 define
a group of SGS3-like proteins that possess all
four signature motifs in Arabidopsis. Thus, our
results demonstrate that this group of SGS3-like
proteins is an important component of RdDM. This
study further enhances our understanding of the
SGS3 gene family and the RdDM pathway.

INTRODUCTION 

In many eukaryotes, RNA-directed DNA methylation
(RdDM) is often associated with transcriptional gene silencing
(TGS) and is considered an essential mechanism to
maintain genome stability and to suppress the proliferation
of transposable elements (1,2). A key component
of RdDM is the ~20-24 nt small interfering RNA derived
from transposons or repetitive sequences (rasiRNA), which
associates with argonaute (AGO) proteins to guide
de novo cytosine methylation at homologous loci (1,2). In
Arabidopsis thaliana, the generation of 24-nt rasiRNAs
depends on RNA-dependent RNA polymerase 2
(RDR2), dicer-like 3 (DCL3), the SNF2-like chromatin-remodeling
factor classy 1 (CLSY1) and the plant-specific
DNA-dependent RNA polymerase IV (Pol IV) (3-6). Pol
IV associates with siRNA-generating loci and is thought
to generate single-stranded RNAs (ssRNAs) from these
loci, which are presumably converted into double-stranded
RNAs (dsRNAs) by RDR2 and subsequently
processed by DCL3 into 24-nt rasiRNA duplexes (3-7).
CLSY1 is required for the correct localization of Pol IV
and RDR2 (8).

After generation, one strand of each siRNA duplex
is loaded into AGO4, AGO6 or AGO9 (9-11).
Presumably through base-pairing between siRNA and
Pol V-dependent transcripts and/or physical interaction
with NRPE1, the largest subunit of Pol V,
AGO4 is guided to targets to recruit domains rearranged
methyltransferase 2 (DRM2) to catalyze de novo cytosine
DNA methylation in symmetric CG and CHG (H is adenine,
thymine or cytosine) and asymmetric CHH contexts
(12-14). It was recently shown that Pol II might recruit
AGO4, Pol IV and Pol V to chromatin through its transcripts
or transcription activity at intergenic low-copy-number
loci (7). Additional RdDM components include
suppressor of Ty insertion 5-like (SPT5L, also known as
KTF1), defective in RNA-directed DNA methylation 1
(DRD1), defective in meristem silencing 3 (DMS3) and
RNA-directed DNA methylation 1 (RDM1) (15-21).
SPT5L interacts with both Pol V transcripts and AGO4
and is thought to act downstream in the RdDM pathway
(16,18), whereas DRD1, DMS3 and RDM1 form a DDR 
complex that is required for the generation of Pol 
V-dependent transcripts (17,21).

The plant-specific SGS3 gene family encodes proteins
containing at least one of the following protein domains:
XS, XH and zf-XS, which were named after Arabidopsis
SGS3 and its rice homolog X1 (22,23). Among these
protein domains, the XS domain is an RNA-binding
domain, the zf-XS domain is a C2H2-type zinc finger
domain and the XH domain is an X-homolog domain
of unknown function (22). In addition to these
domains, some SGS3-like proteins also contain a coiled-coil
domain located between the XS and XH domains
(15,24). Arabidopsis encodes 14 SGS3-like proteins,
including SGS3 and INVOLVED IN DE NOVO 2 (IDN2, also
called RDM12) (15,23,24). While SGS3 is an essential
component of post-transcriptional gene silencing (PTGS)
required for the production of sense-transgene-induced
siRNAs and trans-acting siRNAs (23,25), IDN2/RDM12
functions in RdDM and is required for TGS (15,24). Both SGS3
and IDN2 bind dsRNAs with a 5' overhang (15,26).
However, the functions of the remaining 12 SGS3-like
proteins are still unknown.

Here, we identify five SGS3 homologs, factor of DNA
methylation (FDM) 1, 2, 3, 4 and 5, as important components
of RdDM. Using a combination of transcriptome
co-expression analysis and reverse genetics, we found that
FDM1 and FDM2 display a highly correlated expression
pattern with known components of RdDM. FDM1
and FDM2 act redundantly in DNA methylation, accumulation
of Pol V-dependent rasiRNAs and silencing of
RdDM loci. However, FDM1 and FDM2 are not
required for the accumulation of Pol V- and Pol
II-dependent scaffold transcripts. Furthermore, we show
that IDN2 and three uncharacterized SGS3-like proteins,
FDM3, FDM4 and FDM5, have overlapping functions
with FDM1 in RdDM. FDM2 also has a redundant
function with IDN2 in RdDM. These findings broaden
our knowledge of RdDM and the function of the SGS3
gene family.

MATERIALS AND METHODS 

Plant materials 

The T-DNA insertional mutants fdm1-1 (SALK_075813),
fdm2-1 (SAIL_291_F01) and idn2-3 (Salk_152144)
were obtained from the ABRC Stock Center
(www.arabidopsis.org). The T-DNA insertions were identified
using a combination of gene-specific primers and a
T-DNA left border primer (primers FDM1RP,
FDM1LP and LBa1 for fdm1-1; primers FDM2RP,
FDM2LP and LB3 for fdm2-1; primers IDN2RP,
IDN2LP and LBa1 for idn2-3; Supplementary Table S2).
The fdm1-1 fdm2-1, fdm1-1 idn2-3 and fdm2-1 idn2-3 mutants
were constructed by crossing single mutants. nrpe1-1 (27),
dcl3-1 (6) and the Myc-AGO4 transgenic line were kind
gifts from Dr Xuemei Chen. Myc-AGO4 is in the Ler
genetic background, whereas the other mutants are in the
Columbia genetic background.

Phylogenetic analyses 

Protein sequences for 14 Arabidopsis SGS3-like proteins
were obtained from the Arabidopsis website
(http://www.arabidopsis.org). Full-length protein sequences of the 14
SGS3-like proteins were aligned using CLUSTALW at
the Biology WorkBench (http://workbench.sdsc.edu/).
Phylogenetic analysis was done by the unrooted
neighbor-joining method. To assess the degree of reliability
for each branch on the tree, bootstrap confidence
values of each node were calculated with 1000 replicates
using PAUP 4.0 (http://paup.csit.fsu.edu/).
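The bootstrap step can be illustrated with a minimal sketch. It shows only the general idea of resampling alignment columns with replacement to produce replicate data sets; it is not PAUP's implementation, and the toy sequences and the function name `bootstrap_alignment` are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment; the real input would be the CLUSTALW alignment
# of the 14 full-length SGS3-like protein sequences.
toy = {"protA": "MKTAYIAK", "protB": "MKSAYLAK", "protC": "MRTAFIAK"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a full analysis, a neighbor-joining tree would be built from each replicate, and the bootstrap value of a node would be the fraction of replicate trees that recover it.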

DNA methylation assays 

Genomic DNA was extracted from flowers and digested
overnight with different methylation-sensitive restriction
enzymes (HaeIII, AvaII, HpaII and MspI) or for 1 h with
McrBC. Approximately 5% of the digested DNA was
subsequently used for PCR analysis of AtSN1, IGN5,
FWA SINE and siR02. The undigested genomic DNA
was amplified simultaneously as a loading control. PCR
conditions were: 94°C for 30 s, 54°C for 30 s, 72°C for 1
min, 32 cycles and 72°C for 10 min. For Southern blotting,
5 µg of genomic DNA treated with HaeIII, HpaII or
MspI overnight was resolved in a 1.2% agarose gel and
transferred to Hybond-N+ membranes. 5S rDNA,
MEA-ISR and AtMu1 Southern blotting were carried
out as described (18,27,28). Primers used for DNA methylation
analyses are listed in Supplementary Table S2.
The primer information was obtained from references
(14,18,27,28).
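As a quick arithmetic check on the cycling program above, the sketch below sums the programmed hold times. It assumes the three timed steps make up one cycle and ignores ramp times between temperatures, which the text does not give; the variable and function names are illustrative.

```python
# Cycling program as stated in the text: (step name, temperature in °C, seconds).
CYCLE_STEPS = [("denature", 94, 30), ("anneal", 54, 30), ("extend", 72, 60)]
N_CYCLES = 32
FINAL_EXTENSION_S = 10 * 60  # final 72°C hold

def programmed_hold_seconds(steps, n_cycles, final_extension_s):
    """Total programmed hold time, excluding ramping between temperatures."""
    per_cycle = sum(seconds for _, _, seconds in steps)
    return n_cycles * per_cycle + final_extension_s

total_s = programmed_hold_seconds(CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION_S)
print(total_s / 60)  # 74.0 minutes of programmed hold time
```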

RT-PCR analysis 

Total RNA was extracted from flowers using Trizol
reagent (Sigma). After DNase treatment, 2-5 µg of total
RNA was used to synthesize cDNA with SuperScript III
(Invitrogen) using oligo-dT or gene-specific primers. The
diluted cDNA reaction mixture was used for RT-PCR of
AtSN1, siR02 and 5S rRNA spacer as previously described
(7,29). The constitutively expressed UBQ5 was used as an
internal control. The cDNA reaction mixture without
reverse transcriptase was used in PCR amplification to
confirm the absence of DNA contamination. Pol II-
and Pol V-dependent transcripts were detected by RT-PCR
according to (14). Primers used for RT-PCR
analysis are listed in Supplementary Table S2.

siRNA and miRNA detection 

RNA isolation and hybridization were performed according
to the method described by (30). siR1003, AtSN1,
AtCopia2, SimpleHAT2, siR02, Cluster4 and TR2558
were detected using 5'-end-labeled (32P) antisense LNA
oligonucleotides (7). Probe and primer sequences are
listed in Supplementary Table S2.

Immunolocalization 

Leaves from 28-day-old plants were harvested and the 
immunolocalization experiments were performed as 
described (8,31). 

RNA binding assay 

The RNA and DNA binding assays were performed
as previously described (32). GST and a truncated form
of FDM1 (GST-FDM1ΔZH, amino acids 114-498)
fused to GST were expressed in E. coli BL21 and
purified as described by (33). The templates for RNA1,
2 and 3 were produced by PCR using primers RNA1F/
1R, RNA2F/2R and RNA1F/3R, respectively. The
template for RNA1, 2 and 3 is the B region of the
AtSN1 locus. Primers RNA1F, RNA2R and RNA3R
contain the T7 promoter. The RNAs were synthesized
by in vitro transcription with T7 RNA polymerase in the
presence or absence of [α-32P] UTP. RNA1 was used as
the ssRNA in the binding assay. RNA1 and RNA2 were
annealed to generate dsRNAs with 5' overhangs at
both ends. RNA3 is a dsRNA with 3' overhangs at
both ends. Annealing was performed in annealing
buffer [10 mM Tris-HCl (pH 8.0), 20 mM NaCl,
1 mM EDTA (pH 8.0)] by incubating RNAs at 95°C
for 5 min and then gradually cooling to room temperature.
Sequences for primers are listed in Supplementary
Table S2.
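The probe dimensions reported in the Results (a 35 bp duplex with 18 nt 5' overhangs, a 53 nt ssRNA, and a 36 bp duplex with 17 nt 3' overhangs) can be cross-checked with simple arithmetic. The sketch assumes each strand of a duplex carries exactly one overhang, so that strand length is the duplex length plus the overhang length; the function name is illustrative.

```python
def strand_length(duplex_bp, overhang_nt):
    """Length of one strand in a duplex where each strand carries one overhang."""
    return duplex_bp + overhang_nt

SSRNA_NT = 53  # single-stranded probe length given in the Results

five_prime_probe_nt = strand_length(duplex_bp=35, overhang_nt=18)
three_prime_probe_nt = strand_length(duplex_bp=36, overhang_nt=17)

# Under this assumption every probe strand has the same length as the ssRNA probe.
print(five_prime_probe_nt, three_prime_probe_nt, SSRNA_NT)
```

Equal strand lengths across the probes would mean that differences in binding reflect overhang polarity and strandedness rather than RNA length.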



RESULTS 

At1g15910 and At4g00380 co-expressed with genes in
the RdDM pathway

Phylogenetic analyses using full-length protein sequences
assigned the 14 Arabidopsis SGS3 family members to three
subfamilies (Figure 1A; 34). SGS3 from the first subgroup
and IDN2 from the second subgroup have been shown to
act in PTGS and TGS, respectively (15,23,24). However,
no members of the third subgroup have been studied. To
extend our understanding of SGS3-like proteins, we
selected At1g15910 and At4g00380 from subgroup 3 for
functional characterization as they contain the zf-XS, XS,
XH and coiled-coil domains (Figure 1B). The protein sequences
of At1g15910 and At4g00380 are highly similar
(93% identity and 96% similarity; Supplementary
Figure S1), indicating that they might have redundant
functions. This was supported by the similar expression
patterns of At1g15910 and At4g00380 in leaves,
flowers, stems and roots (Figure 1C). However, their
expression levels varied among leaves, flowers, stems
and roots, suggesting that their expression may be
developmentally regulated (Figure 1C).

Figure 1. FDM1 and FDM2 are putative components of the RdDM pathway. (A) Unrooted neighbor-joining phylogeny based on full-length amino
acid sequences of 14 Arabidopsis SGS3-like proteins. Bootstrap values are given for each branch node. Dark gray, subfamily 1; light gray, subfamily 2;
white, subfamily 3. (B) A scheme of the protein structures of At1G15910 (FDM1) and At4G00380 (FDM2). Black box, the zf-XS domain; open box,
the XS domain; gray box, the coiled-coil domain; hatched box, the XH domain. (C) RT-PCR analysis of At1G15910 and At4G00380 expression in
root, leaf, flower and stem. Amplification of UBIQUITIN5 (At3g26650; UBQ5) with or without reverse transcription (RT) is shown as a control.
(D) Correlation among several genes involved in the RdDM pathway and FDM1/FDM2. Black circle, FDM1/FDM2; open circle, genes involved in
RdDM. Solid black line, r > 0.9; dotted line, 0.9 > r > 0.830. Asterisk: because of cross-hybridization of IDN2 and At4g01780 in the microarray
experiment, they were considered as a single gene during co-expression analysis. (E) Diagrams of the T-DNA insertions in fdm1-1 and fdm2-1,
respectively. Black box, coding region; solid black line, intron; open triangle, T-DNA insertion site. Gray arrowheads, primers used for T-DNA genotyping;
black arrowheads, primers used for RT-PCR analysis. (F) RT-PCR analysis of FDM1 and FDM2 expression in fdm1-1, fdm2-1 and Col (wild-type:
WT). Amplification of UBQ5 with or without RT is shown as a control.

To infer the functions of At1g15910 and At4g00380, we
searched for their co-expressed genes within the
ATTED-II developmental expression data set using a
co-expression analysis program at the RIKEN PRIMe
website (35,36). This search was based on the hypothesis
that genes involved in a particular biological process often
share regulatory systems and thus have similar expression
patterns (37). Because of cross-hybridization between
At1g15910 and At4g00380 in the microarray experiments,
they were considered as a single gene in the analysis. The
results showed that At1g15910/At4g00380 had a very
strong correlation with AGO4, NRPE1, DRD1, DMS3,
IDN2 and RDR2 (correlation coefficient r > 0.83;
Figure 1D). These RdDM genes were coordinately
expressed with At1g15910/At4g00380 in roots, embryos,
siliques, leaves, stems and flowers (Supplementary Figure
S2), according to the Arabidopsis eFP-Browser, which was
developed to interpret gene expression data of Arabidopsis
(38). At1g15910/At4g00380 also had a considerably high
correlation with DCL3, NRPD1 and DRM2 (0.68 <
r < 0.83; Supplementary Table S1), as their expression
overlapped in various tissues and at different developmental
stages (Supplementary Figure S2). Altogether,
these results showed a correlation between At1g15910/
At4g00380 and known genes involved in the RdDM
pathway and therefore suggested their potential role in
RdDM. We named At1g15910 and At4g00380 factor of
DNA methylation 1 (FDM1) and factor of DNA methylation
2 (FDM2), respectively, because we subsequently
showed that they acted in RdDM (see below).
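The co-expression reasoning can be sketched in a few lines. The example assumes that r is a Pearson correlation coefficient computed over expression values from matched samples, which the text does not state explicitly, and it reuses the line-style thresholds from the Figure 1D legend. The expression values and function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def edge_style(r):
    """Map r to the Figure 1D line style; None means no edge is drawn."""
    if r > 0.9:
        return "solid"
    if r > 0.83:
        return "dotted"
    return None

# Hypothetical expression values across six tissues (not data from this study).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]   # tracks gene_a closely
gene_c = [6.0, 1.0, 4.0, 2.0, 5.0, 3.0]    # unrelated pattern

print(edge_style(pearson_r(gene_a, gene_b)), edge_style(pearson_r(gene_a, gene_c)))
```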

FDM1 and FDM2 have redundant and essential 
roles in RdDM 

To examine the function of FDM1 and FDM2, two
T-DNA insertion lines, SALK_075378 for FDM1 (39)
and SAIL_291_F01 for FDM2 (40), were obtained from
the Arabidopsis stock center (http://www.arabidopsis.org)
and further characterized. As a first step, plants homozygous
for SALK_075378 (named fdm1-1) and
SAIL_291_F01 (named fdm2-1) were identified by PCR
genotyping (Supplementary Figure S3). Sequence
analysis of the flanking regions of the T-DNA revealed
that fdm1-1 contained a T-DNA insertion in the first
intron (949 bp downstream from the ATG site) of
FDM1 and fdm2-1 harbored a T-DNA insertion in the
fifth intron (2252 bp downstream from the ATG site) of
FDM2 (Figure 1E). Using RT-PCR analysis, we failed to
detect the transcripts of FDM1 and FDM2 in fdm1-1 and
fdm2-1 (Figure 1F), respectively, indicating that they are
potentially null alleles of FDM1 and FDM2. As FDM1
and FDM2 might have redundant functions, we constructed
an fdm1-1 fdm2-1 double mutant by crossing the
two respective single mutant lines. No obvious phenotypic
abnormalities were observed in fdm1-1, fdm2-1 and fdm1-1
fdm2-1 (Supplementary Figure S4).



To evaluate whether FDM1 and FDM2 have roles in
the RdDM pathway, we examined DNA methylation
status at known RdDM-regulated loci such
as the retrotransposon AtSN1 and IGN5 in fdm1-1, fdm2-1, fdm1-1 fdm2-1
and Arabidopsis ecotype Columbia (wild-type control;
WT) plants by using digestion with the methylation-sensitive
restriction enzyme HaeIII followed by PCR, which detects
CHH methylation. HaeIII cannot cleave AtSN1 and
IGN5 DNAs from WT due to DNA methylation at its
cleavage site (14,41). A reduction in DNA methylation
will cause AtSN1 and IGN5 DNAs to be less resistant to
HaeIII cleavage, resulting in reduced or undetectable PCR
products (14,41). As shown in Figure 2A, fdm1-1 but not
fdm2-1 showed a moderate reduction of DNA methylation
at the AtSN1 and IGN5 loci relative to WT. A reduction of
DNA methylation at short interspersed repetitive elements
upstream of the FWA gene (FWA SINE) in fdm1-1 but not in
fdm2-1 was also detected by digestion with the
methylation-sensitive enzyme AvaII (Figure 2A) (27). The reduction
of DNA methylation in fdm1-1 but not fdm2-1 may be
correlated with the reduction of FDM2 transcript abundance
in fdm1-1 and increased FDM1 transcript levels in
fdm2-1 (Figure 1F). Introducing the WT FDM1 genomic
DNA into fdm1-1 fully recovered the DNA methylation
levels at the AtSN1 locus (Supplementary Figure S5A),
demonstrating that the reduction in DNA methylation
in fdm1-1 is due to FDM1 loss-of-function. The restriction
digestion patterns of AtSN1, IGN5 and FWA SINE
DNAs in fdm1-1 fdm2-1 were similar to those in nrpe1-1,
indicating a strong loss of DNA methylation at these
loci (Figure 2A). The reduction of DNA methylation at
the AtSN1 locus in fdm1-1 fdm2-1 was further confirmed by
McrBC enzyme digestion followed by PCR (Figure 2B).
The McrBC enzyme cuts methylated but not unmethylated
DNA. A reduction in DNA methylation will result
in increased PCR products after McrBC treatment. This
assay also revealed a reduction in DNA methylation at the
siR02 locus in fdm1-1 fdm2-1 (Figure 2B). We further
examined the DNA methylation status of 5S rDNA,
AtMu1 and MEA-ISR using the methylation-sensitive
restriction enzymes HaeIII, HpaII (for CG and CHG methylation)
and MspI (for CG methylation) followed by
Southern blotting (18,27,28). A strong reduction in
DNA methylation at the 5S rDNA, AtMu1 and MEA-ISR
loci, comparable to that in nrpe1-1, was observed in fdm1-1
fdm2-1 but not in fdm1-1 or fdm2-1 (Figure 2C-E).
Next, we examined the methylation status of the highly
repetitive 180-bp centromeric repeat, which is not an RdDM
target (11). The DNA methylation at this locus showed no
obvious alteration in fdm1-1 fdm2-1 and nrpe1-1
compared with WT (Figure 2F). This indicated that the
function of FDM1 and FDM2 in DNA methylation is
rasiRNA dependent. To confirm that the strong reduction
of DNA methylation in fdm1-1 fdm2-1 is due to the lack of
both FDM1 and FDM2, we introduced the WT FDM1 or
FDM2 genomic DNA into fdm1-1 fdm2-1. Two randomly
chosen transgenic fdm1-1 fdm2-1 lines harboring the
FDM1 transgene showed DNA methylation
levels at AtSN1 and IGN5 comparable to those of WT and fdm2-1, while two
fdm1-1 fdm2-1 lines containing the FDM2 transgene had
DNA methylation levels similar to those of fdm1-1 (Supplementary
Figure S5B). These results demonstrated that FDM1 and
FDM2 act redundantly in RdDM.
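The two digestion-PCR readouts described above point in opposite directions, which is easy to misread. The sketch below encodes the logic as stated in the text, using an idealized all-or-nothing model in which a cut template gives no PCR product. Real assays give graded band intensities, and the names used here are illustrative.

```python
# Whether each enzyme cleaves methylated DNA, as described in the text:
# HaeIII is blocked by methylation at its site; McrBC cuts only methylated DNA.
CUTS_METHYLATED = {"HaeIII": False, "McrBC": True}

def pcr_product_expected(enzyme, site_methylated):
    """True if the template should survive digestion and yield a PCR product."""
    if CUTS_METHYLATED[enzyme]:
        return not site_methylated  # McrBC-type: methylated template is destroyed
    return site_methylated          # HaeIII-type: methylation protects the template

for enzyme in ("HaeIII", "McrBC"):
    for methylated in (True, False):
        print(enzyme, "methylated" if methylated else "unmethylated",
              "-> product" if pcr_product_expected(enzyme, methylated) else "-> no product")
```

Under this model, loss of methylation in a mutant appears as a weaker band after HaeIII digestion but a stronger band after McrBC digestion.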

Next, we examined the expression levels of AtSN1, 5S
rRNA spacer and siR02 in fdm1-1, fdm2-1, fdm1-1
fdm2-1, nrpe1-1 and WT by RT-PCR. Their transcripts
in fdm1-1 fdm2-1 but not in fdm1-1 or fdm2-1 were significantly
increased to levels comparable to those in nrpe1-1
(Figure 3A and B). These results revealed that the reduction
of DNA methylation in fdm1-1 fdm2-1 is correlated
with derepression of RdDM target loci.

The levels of Pol V-dependent rasiRNAs are
reduced in fdm1-1 fdm2-1

Based on their dependence on Pol V and Pol IV,
rasiRNAs are classified into two types (27). The
accumulation of type I rasiRNAs, which are derived from
highly repetitive DNA sequences, including AtSN1,
siR1003 (from 5S rDNA), AtREP2, SimpleHAT2 and
AtCopia2, depends on both Pol V and Pol IV, whereas
the levels of type II rasiRNAs generated from low-copy-number
DNA repeats, such as siR02, Cluster4, TR2558,
Cluster2 and soloLTR, require Pol IV but not Pol V (27).
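The classification above amounts to a small lookup of polymerase requirements. The sketch below is a simplified, all-or-nothing representation of that rule, with illustrative names; it predicts which rasiRNA type should still accumulate when a given polymerase is lost.

```python
# Polymerase requirements for each rasiRNA type, as described in the text.
REQUIRES = {
    "type I": {"Pol IV", "Pol V"},
    "type II": {"Pol IV"},
}

def accumulates(rasirna_type, lost_polymerases):
    """True if no polymerase required by this rasiRNA type has been lost."""
    return not (REQUIRES[rasirna_type] & set(lost_polymerases))

# A mutant lacking Pol V activity (for example nrpe1, since NRPE1 is the
# largest Pol V subunit) should lose type I but keep type II rasiRNAs.
print(accumulates("type I", {"Pol V"}), accumulates("type II", {"Pol V"}))  # False True
```

A mutant that reduces type I but not type II rasiRNAs therefore resembles a Pol V defect rather than a Pol IV defect.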

We examined the accumulation of both type I
rasiRNAs and type II rasiRNAs in fdm1-1, fdm2-1 and
fdm1-1 fdm2-1 by northern blotting. The accumulation of
both type I rasiRNAs (AtSN1, siR1003, AtCopia2 and
SimpleHAT2) and type II rasiRNAs (siR02, Cluster4,
TR2558) was reduced in dcl3-1 but not in fdm1-1 and
fdm2-1 relative to WT (Figure 3D and E). As in
nrpe1-1, the accumulation of type I but not type II
rasiRNAs was significantly reduced in fdm1-1 fdm2-1
compared with WT (Figure 3D and E). These results suggested
that FDM1 and FDM2 act redundantly to promote
the accumulation of type I rasiRNAs but not type II
rasiRNAs. We next tested whether FDM1 and FDM2
were involved in the accumulation of microRNAs
(miRNAs). However, the levels of DCL1-dependent
miR172 and miR173 in fdm1-1, fdm2-1 and fdm1-1
fdm2-1 were similar to those in WT (Supplementary
Figure S6).

FDM1 and FDM2 are not required for the localization
of NRPD1, RDR2, NRPE1 and AGO4 and for the
accumulation of Pol V- or Pol II-dependent
non-coding transcripts

To explore the role of FDM1 and FDM2 in RdDM, we
examined the nuclear localization of NRPD1, RDR2,
NRPE1 and AGO4 in fdm1-1 fdm2-1. As shown in
Figure 4, in both WT and fdm1-1 fdm2-1 nuclei NRPD1
displayed punctate foci signals in the nucleoplasm. In
contrast, as previously reported (31,42), RDR2, NRPE1
and AGO4 showed a round-shaped nucleolar signal in
addition to puncta or diffuse signals outside the nucleolus
both in WT and fdm1-1 fdm2-1 (Figure 4). Thus, the
fdm1-1 and fdm2-1 double mutations have no effect on
the localization of the RdDM players NRPD1, NRPE1,
RDR2 and AGO4.

Next, we tested the requirement of FDM1 and FDM2
for the accumulation of Pol V- or Pol II-dependent
non-coding transcripts that serve as scaffolds to recruit
the AGO4-siRNA complex to chromatin (7,43). RT-PCR
analyses showed that the Pol V-dependent transcripts at
the AtSN1 locus (interval B) and Pol II-dependent transcripts
at the siR02 locus (interval B) were not affected in fdm1-1
fdm2-1 (Figure 3C).

FDM1 binds dsRNAs with 5' overhangs

As SGS3 and IDN2 have been shown to bind
dsRNAs, we tested whether FDM1 is an RNA-binding
protein using a GST pull-down assay. Because
truncated SGS3 and IDN2 proteins containing the XS
and coiled-coil domains are able to bind dsRNAs, we expressed
a truncated version of FDM1 lacking the zinc
finger and XH domains, fused with a GST tag at its
N-terminus (GST-FDM1ΔZH), and a GST control
protein in E. coli. The GST-FDM1ΔZH and GST
proteins were purified with glutathione beads (Figure
5A). We prepared various radioactively labeled RNA
species including ssRNAs, dsRNAs with 3' overhangs and
dsRNAs with 5' overhangs (Figure 5B and C). These
probes were incubated with the glutathione beads containing
GST-FDM1ΔZH or GST alone. GST-FDM1ΔZH
retained radioactive 35 bp dsRNAs with 18 nt 5'
overhangs at each end but not 53 nt ssRNAs or a
36 bp dsRNA with a 17 nt 3' overhang at each end,
whereas GST alone did not bind any RNA species
(Figure 5B and C). Furthermore, addition of unlabeled
dsRNAs of the same sequence efficiently reduced the
binding of the radioactive probe by GST-FDM1ΔZH
(Figure 5C). These results demonstrated that FDM1
binds dsRNAs with 5' overhangs.

Figure 4. The fdm1-1 and fdm2-1 double mutations have no effect on
the nuclear localization of the RdDM proteins NRPD1, RDR2, NRPE1 and
AGO4. Peptide antibodies specifically recognizing native NRPD1,
RDR2, NRPE1 or AGO4 (in red) were used to perform immunolocalization
experiments in Arabidopsis leaf nuclei from ecotype
Columbia (WT) and the fdm1-1 fdm2-1 mutant line. DNA was counterstained
with DAPI. Scale bar corresponds to 5 µm.

Figure 5. FDM1 binds dsRNAs with 5' overhangs. (A) The two
purified proteins used in the binding assay, GST and GST-FDM1ΔZH
(truncated FDM1 containing XS and SMC domain),
were resolved in an SDS-PAGE gel and stained with Coomassie Blue.
The protein molecular weights are indicated on the right. (B and C)
RNA-binding assays of FDM1 with various probes. The structure of
the various probes is shown on the right. Asterisk indicates the
radioactively labeled RNA strand. Approximately 50 µg of protein was used for
the binding assay. For dsRNAs with 5' overhangs, 1x, 10x and 150x
unlabeled RNAs of the same sequence were used for the competition
assay.

RNA-mediated in vitro AGO4-FDM1 interaction

We next tested whether FDM1 interacts with AGO4 and
RDR2 by in vitro protein pull-down assay in order to gain
insight into the function of FDM1 in RdDM. A full-length
FDM1 fused with a GST tag at its N-terminus was expressed
in E. coli and purified with glutathione beads
(Supplementary Figure S7). The glutathione beads
conjugated with GST-FDM1 were incubated with protein
extracts containing HA-RDR2 or MYC-AGO4. Western
blotting detected the enrichment of MYC-AGO4 but not
HA-RDR2 in the GST-FDM1 complex (Supplementary
Figure S7). In contrast, the control GST protein
alone failed to pull down MYC-AGO4 (Supplementary
Figure S7). Because both AGO4 and FDM1 are RNA-binding
proteins, we tested whether the interaction is
RNA-mediated. RNase A treatment abolished the AGO4-FDM1
interaction (Supplementary Figure S7).

FDM1 and FDM2 have overlapping functions with IDN2
in the RdDM pathway

Because FDM1 and FDM2 protein sequences share considerable
similarity with that of IDN2/RDM12 (~60%)
and all of them are involved in RdDM, we asked whether
they have overlapping functions. We obtained a T-DNA
insertion line, Salk_152144, for IDN2/RDM12 from the
Arabidopsis stock center and identified homozygous
mutants by PCR genotyping (Supplementary Figure S8).
We named this line idn2-3. The transcript levels of IDN2
were reduced in idn2-3 (Supplementary Figure S8C),
resulting in a moderate reduction in DNA methylation
at the AtSN1 and IGN5 loci (Figure 6A). We constructed
two double mutants, fdm1-1 idn2-3 and fdm2-1 idn2-3,
by crossing single mutants and analyzed DNA methylation
status at the AtSN1 and IGN5 loci. Like fdm1-1 fdm2-1,
fdm1-1 idn2-3 and fdm2-1 idn2-3 showed a strong reduction
in DNA methylation compared with each of the single
mutants (Figure 6A). We noticed that fdm1-1
idn2-3 showed a stronger reduction in DNA methylation
at the IGN5 locus than fdm1-1 fdm2-1 and fdm2-1 idn2-3.
This result may be related to the reduced expression of
FDM2 in the fdm1-1 genetic background (Figure 1F).
fdm1-1 idn2-3 also displayed reduced DNA
methylation at the 5S rDNA locus relative to fdm1-1 and
idn2-3 (Figure 6B).



FDM3, FDM4 and FDM5 act redundantly with 
FDM1 in RdDM 

IDN2/RDM12, FDM1 and FDM2 have three additional
homologs, At3G12550 (subfamily 2), At1g13790 (subfamily 2)
and At1g80790 (subfamily 3), that contain
all four signature motifs of the SGS3 protein family in
Arabidopsis. We named these proteins FDM3, FDM4 and
FDM5, respectively, and tested whether they have functions
in RdDM. Homozygous T-DNA insertion lines
Salk_020841 for At3G12550 (fdm3-1), Salk_008738
for At1G13790 (fdm4-1) and Salk_052192 for
At1G80790 (fdm5-1) were obtained from the Arabidopsis stock center
(Supplementary Figure S8). No transcripts for FDM3 and
FDM4 were detected in fdm3-1 and fdm4-1, respectively,
whereas the abundance of FDM5 transcripts was reduced
significantly in fdm5-1 (Supplementary Figure S8). The
DNA methylation status of the AtSN1, IGN5 and 5S
rDNA loci in fdm3-1, fdm4-1 and fdm5-1 showed no alteration
relative to WT. We next tested whether FDM3,
FDM4 and FDM5 have redundant functions with
FDM1. Indeed, the DNA methylation levels at the
AtSN1, IGN5 and 5S rDNA loci were strongly reduced in
fdm1-1 fdm3-1, fdm1-1 fdm4-1 and fdm1-1 fdm5-1 compared
with each of the single mutants and WT. In fdm1-1 fdm3-1,
fdm1-1 fdm4-1 and fdm1-1 fdm5-1 expressing the FDM3,
FDM4 and FDM5 transgenes under the control of their
native promoters, respectively, the DNA methylation
levels at AtSN1 and IGN5 were comparable with those in
fdm1-1, indicating that the lack of FDM3, FDM4 or FDM5 is
responsible for the enhanced DNA methylation defects in
the double mutants (data not shown).

Figure 6. FDM1 has overlapping functions with IDN2, FDM3, FDM4
and FDM5. (A) DNA methylation levels at the AtSN1 and IGN5 loci in
various genotypes. HaeIII-digested genomic DNAs were used for PCR
amplification of AtSN1 and IGN5. Amplification of undigested DNAs
was used as a loading control. (B) DNA methylation at the 5S rDNA locus
in various genotypes.



DISCUSSION 

The SGS3-like genes encode a large uncharacterized
protein family. In this study, through a combination of
transcriptome co-expression analysis, reverse genetics and
biochemical assays, we show that two SGS3-like proteins,
FDM1 and FDM2 from Arabidopsis, are essential components
of gene silencing triggered by small RNAs. FDM1
and FDM2 share high similarity, and the lack of both
causes a great reduction in DNA methylation levels and Pol
V-dependent rasiRNA accumulation, resulting in the release
of TGS. These results demonstrate that FDM1 and
FDM2 have essential and redundant roles in the RdDM
pathway.

Co-expression analysis revealed that AGO4, NRPE1,
DRD1, DMS3, IDN2/RDM12, FDM1/FDM2, DCL3 and
RDR2 are highly correlated with each other (r > 0.76;
Figure 1D and Supplementary Table S1). NRPD1 and
DRM2 also display considerable correlation with these
genes (r > 0.6 and r > 0.5, respectively; Supplementary
Table S1). These results are supported by their
coordinated high expression in tissues with active DNA
replication, such as the inflorescence meristem, shoot meristem
and developing embryo (Supplementary Figure S2),
which agrees with their role in directing de novo DNA
methylation (1,2). The correlation among genes involved
in RdDM indicates that they may share a common regulatory
system and tend to be co-expressed. Consequently,
searching for co-expressed genes combined with reverse
genetic analysis could be a powerful tool to identify
novel genes that are involved in RdDM, especially those
with functional redundancy.

How do FDM1 and FDM2 function in RdDM? They
appear not to be required for the correct localization of
NRPD1, RDR2, NRPE1 and AGO4, as these proteins
have similar localizations in fdm1-1 fdm2-1 and in WT
(Figure 4). Like IDN2 and SGS3 (15,26), FDM1 binds
dsRNAs with 5' overhangs (Figure 5). Given its
sequence similarity and functional redundancy with
FDM1, FDM2 most likely interacts with dsRNAs with 5'
overhangs too. These observations suggest at least two
hypotheses for FDM1 and FDM2 function, as proposed
for IDN2/RDM12 (15,24). The first is that FDM1 and
FDM2 may bind dsRNA produced by RDR2 to stabilize
it, which may be required for rasiRNA biogenesis (24).
The second is that FDM1 may interact with AGO4-bound
dsRNAs generated by base pairing between
rasiRNAs and target transcripts produced by Pol II or
Pol V to stabilize the rasiRNA-target interaction or recruit
downstream components such as DRM2 to chromatin
(15,24). fdm1-1 fdm2-1 displayed reduced DNA methylation
levels at both type I and type II rasiRNA-generating loci
(Figure 2) as well as reduced amounts of type I rasiRNAs
but not type II rasiRNAs (Figure 3). These molecular
phenotypes of fdm1-1 fdm2-1 resemble those of nrpe1,
ago4, rdm1 and drd1, indicating that, like NRPE1,
AGO4, DRD1 and RDM1, FDM1 and FDM2 may act
downstream of rasiRNA initiation in RdDM. In
addition, FDM1 and FDM2 are not required for the accumulation
of either Pol V-dependent or Pol II-dependent
scaffold transcripts, indicating that FDM1 and FDM2 may act
downstream of Pol V and Pol II activities. Thus, we favor
the suggestion that FDM1/FDM2 binds the rasiRNA-target
duplex. In fact, an RNA-mediated AGO4-FDM1
association is observed, whereas an RDR2-FDM1 interaction
is not detected (Supplementary Figure S7).

The Arabidopsis genome encodes 14 SGS3-like proteins
(34) that can be assigned to three subfamilies. IDN2 and
FDM1/FDM2 belong to subfamilies 2 and 3, respectively
(Figure 1A). However, their protein sequences are very
similar (~60% similarity), indicating that they may have
closely related functions. This notion is strongly supported
by the fact that fdm1-1 idn2-3 and fdm2-1 idn2-3 show a
much stronger reduction in DNA methylation than each
of the single mutants (Figure 6). Arabidopsis encodes six
SGS3-like proteins from subfamilies 2 and 3, namely
IDN2, FDM1, FDM2, FDM3, FDM4 and FDM5,
which contain all four signature domains of SGS3-like
proteins. The double mutant analyses reveal that FDM3,
FDM4 and FDM5 have redundant roles with FDM1 in
RdDM (Figure 6). Thus, our study defines a group of
SGS3-like proteins that play important roles in RdDM.
Clearly, further work is required to determine their molecular
role in RdDM.